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Embedding Population Dynamics 
Models in Inference 

Stephen T. Buckland, Ken B. Newman, Carmen Fernandez, Len Thomas and John 
Harwood 



Abstract. Increasing pressures on the environment are generating an 
ever-increasing need to manage animal and plant populations sustain- 
ably, and to protect and rebuild endangered populations. Effective man- 
agement requires reliable mathematical models, so that the effects of 
management action can be predicted, and the uncertainty in these 
predictions quantified. These models must be able to predict the re- 
sponse of populations to anthropogenic change, while handling the 
major sources of uncertainty. We describe a simple "building block" 
approach to formulating discrete-time models. We show how to esti- 
mate the parameters of such models from time series of data, and how 
to quantify uncertainty in those estimates and in numbers of individu- 
als of different types in populations, using computer-intensive Bayesian 
methods. We also discuss advantages and pitfalls of the approach, and 
give an example using the British grey seal population. 

Key words and phrases: Hidden process models, filtering, Kalman fil- 
ter, matrix population models, Markov chain Monte Carlo, particle 
filter, sequential importance sampling, state-space models. 
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1. INTRODUCTION 

At the 2002 World Summit on Sustainable De- 
velopment in Johannesburg, political leaders agreed 
to strive for "a significant reduction in the current 
rate of loss of biological diversity" by the year 2010. 
Initial steps to achieve this include developing and 
implementing monitoring programs and data collec- 
tion procedures that quantify the rate of loss of bio- 
diversity (Buckland, Magurran, Green and Fewster, 
?yearBMGF05), allowing assessment of the success 
of management actions. However, monitoring is a 
blunt instrument for management, because of the 
long lag between action and observed effects. Ex- 
planatory mathematical models provide "what if" 
tools for predicting the impact of different manage- 
ment actions on populations and biodiversity be- 
fore a course of action is chosen. Both elements 
are needed: models, to guide and increase the effec- 
tiveness of management action; and monitoring, to 
provide a retrospective measure of whether the pre- 
dicted effects have been achieved. Indeed, the two 
approaches should be fully integrated, by using the 
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data from monitoring programs to update and fine- 
tune the mathematical models. One way to integrate 
the two elements is through the use of adaptive man- 
agement techniques (Walters, 2002). 

The process of creating explanatory mathematical 
models involves several steps, including formulat- 
ing one or more models, fitting the models to data, 
and, in situations of multiple models, selecting or 
averaging models that will be used for prediction 
(Burnham and Anderson, 2002). Realistic models 
that meet the needs of policymakers and biodiver- 
sity managers often have a high degree of complex- 
ity. If these people are not to be misled, it is essential 
that uncertainty in these models is quantified (Har- 
wood and Stokes, 2003; Clark and Bj0rnstad, 2004). 
The uncertainty in modeling population dynamics 
that arises includes four main sources: process vari- 
ation (demographic and environmental stochastic- 
ity), observation error, parameter uncertainty and 
model uncertainty. 

A popular and powerful approach to formulating 
models for plant and animal population dynamics 
has been the use of matrix models (Caswell, 2001). 
Such models explicitly account for the various pro- 
cesses that affect the dynamics of populations, for 
example, survival, birth, movement. However, fitting 
such matrix models to data, for example, estimating 
survival and fecundity rates, has often been carried 
out in a somewhat piecemeal fashion, and uncer- 
tainty about model parameters has not always been 
incorporated into model projections, let alone un- 
certainty associated with the initial choice of the 
particular matrix model. 

In contrast, many statistical models for time series 
of population data can readily be fit in an integrated 
manner and various forms of uncertainty simulta- 
neously accounted for. However, model formulation 
is often empirical, with no attempt to incorporate 
the processes underlying population dynamics ex- 
plicitly, for example, AR(p) and ARIMA models. 

In this paper we review recent developments in 
modeling population dynamics and describe an in- 
tegrated approach to formulating, fitting and select- 
ing realistic models for population dynamics that 
builds upon the matrix model framework. The re- 
sulting models are not necessarily matrix models, 
and are in fact more flexible. The approach is em- 
bedded within a Bayesian inferential framework, so 
that the main sources of uncertainty are accommo- 
dated. Two model fitting approaches are discussed, 
Markov chain Monte Carlo (MCMC) and sequential 



importance sampling (SIS), and an example of the 
approach is given for the British grey seal popula- 
tion. 

2. RECENT DEVELOPMENTS IN MODELING 
POPULATION DYNAMICS 

Matrix population models have long been used 
for describing population dynamics. Following the 
pioneering work of Leslie (1945, 1948), such mod- 
els were usually deterministic; if observational data 
were used at all for fitting models, it was for ad hoc 
estimation of parameters of the population model, 
to allow deterministic projection of the population 
of interest. Caswell (2001) gives a comprehensive ac- 
count of the mathematical development of the topic, 
which includes stochastic extensions (demographic 
and environmental variation), and asymptotic anal- 
yses of growth rates and age and stage class distribu- 
tions for deterministic and stochastic matrix mod- 
els. However, in many cases an integration of field 
or laboratory experiment data with an underlying 
population dynamics model is lacking. Quoting Tul- 
japurkar (1997) following an exposition of analysis 
of stochastic matrix models: "A useful application of 
stochastic models that we have not discussed here is 
the development of methods for estimating the vi- 
tal rates of structured populations from data. Such 
estimation methods . . . are desirable in ecology, al- 
though rarely used." 

An integration of data with population dynamics 
models, and an embedding of such models in statis- 
tical inference, is needed to allow for simultaneous 
accounting of uncertainties about model parameter 
values, observation error, process variation (demo- 
graphic and environmental stochasticity) and model 
uncertainty. State-space models (Harvey, 1989; West 
and Harrison, 1997), particularly when applied in 
a Bayesian framework, are a means of integrating 
data with population dynamics models and readily 
quantifying the various types of uncertainty (Calder, 
Lavine, Miiller and Clark, 2003; Clark, Ferraz, Oguge, 
Hays and DiCostanzo, 2005). A process model speci- 
fies probability distributions associated with the rel- 
evant population processes, such as birth, survival 
and movement, and a corresponding observation model 
defines the probability distribution of observations, 
relating them to states. Using the terminology of 
Caswell (2001, pages 37-38), individuals are clas- 
sified according to i-states (e.g., age category, sex, 
species). The vector of counts (or biomass) of indi- 
viduals by i-state is the p-state of the population, 
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that is, the distribution of i-states in the popula- 
tion. The elements of the state vector in the process 
model may correspond to individuals or to groups of 
individuals. State-space models are first-order 
Markov: the state at time t depends on the state at 
time t—1, but is conditionally independent of ear- 
lier states. Higher-order Markov models are readily 
accommodated, and Newman, Buckland, Lindley, 
Thomas and Fernandez (2006) use the term "hid- 
den process models" to describe the entire range of 
such models. 

Lavine, Beckage and Clark (2002) provide an ex- 
ample of a complex process model for tree seedling 
mortality which could be formulated as a state equa- 
tion. The observation equation in their case is degen- 
erate in the sense that there is (assumed to be) no 
observation error: the true numbers of seedlings are 
recorded on the plots. Thus the observation equa- 
tion would equate observations to the sum of counts 
for indistinguishable types, while the state equation 
would split out these types — hence it is an example 
of a hidden process model. If inference were to be 
drawn on a wider population beyond the sampled 
plots, then the observation equation would not be 
degenerate. 

Many of the key developments in this field have 
been made in the context of fisheries stock assess- 
ment; Quinn and Deriso (1999) give a detailed ac- 
count of these developments. Early methods typi- 
cally modeled either process variation or (more usu- 
ally) observation error (Hilborn and Walters, 1992, 
page 226). An exception was Collie and Sissenwine 
(1983), who modeled a time series of relative abun- 
dance data with observation error to estimate pop- 
ulation size. Process variation was incorporated by 
assuming that survival rate was a random variable. 
Mendelssohn (1988) extended their approach to al- 
low the underlying population dynamics to be ran- 
dom. He obtained maximum likelihood estimates 
of the model parameters using the Kalman filter 
(Kalman, 1960) together with the EM algorithm 
(Dempster, Laird and Rubin, 1977). Gudmundsson 
(1987, 1994), Sullivan (1992) and Schnute (1994) all 
used the Kalman filter to fit state-space models of 
fish dynamics to time series of catch data, and New- 
man (1998) incorporated a spatial component. All 
of these authors adopted linear models (or linear 
approximations to nonlinear models), and assumed 
that process variation and observation errors were 
both normally distributed. 



Nonlinear, nonnormal models have been developed 
within a Bayesian framework. Hilborn, Pikitch and 
McAllister (1994), McAllister, Pikitch, Punt and 
Hilborn (1994) and Schnute (1994) developed Bayesian 
approaches in a fisheries context, and McAllister and 
Ianelli (1997) compared fitting algorithms based on 
sequential importance sampling (SIS, also known as 
particle filtering), Markov chain Monte Carlo (MCMC) 
and adaptive importance sampling. Millar and Meyer 
(2000) developed an MCMC fitting algorithm, which 
Meyer and Millar (1999) implemented in the BUGS 
package, thus making the methods more accessible 
to the user community. Rivot, Prevost, Parent and 
Bagliniere (2004) also used BUGS to fit their model. 
Cunningham, Reid, McAllister, Kirkwood and Darby 
(2007) used the sampling importance resampling (SIR) 
algorithm (Rubin, 1988) to model three mackerel 
stocks, spread through seven geographic regions, with 
deterministic movement between them. 

Outside of fisheries, Raftery, Givens and Zeh (1995) 
used a "Bayesian synthesis" approach to draw in- 
ference from a deterministic population dynamics 
model for bowhead whales. As pointed out by Wolpert 
(1995), their approach suffers from Borel's paradox: 
their results are dependent on the (nonunique) pa- 
rameterization chosen for the population dynam- 
ics model. Subsequently, Poole and Raftery (2000) 
developed a "Bayesian melding" method, which is 
coherent. Trenkel, Elston and Buckland (2000) de- 
veloped a model of red deer population dynamics 
that managers could use to explore the likely ef- 
fects of different culling strategies. They used SIS 
with kernel smoothing to fit the model in a Bayesian 
framework. Besbeas, Freeman, Morgan and Catch- 
pole (2002), Besbeas, Lebreton and Morgan (2003) 
and Besbeas, Freeman and Morgan (2005) used the 
Kalman filter to fit state-space models to a com- 
bination of abundance and demographic data on 
two species of birds (grey heron and northern lap- 
wing). MCMC has also been used to fit state-space 
models for bird populations (Wikle, 2003; Clark and 
Bj0rnstad, 2004) and moose (Clark and Bj0rnstad, 
2004). Thomas, Buckland, Newman and Harwood 
(2005) used SIS to model the dynamics of a spatially 
structured grey seal population, allowing estimation 
of movement rates between regions. Lele (2006) used 
the composite-likelihood method for estimating the 
parameters of the Gompertz model in the presence 
of sampling variability. 

To date, model uncertainty has received little at- 
tention. Bayesian model averaging may be incorpo- 
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rated into importance sampling as noted by Buck- 
land, Newman, Thomas and Koesters (2004) and 
Cunningham et al. (2007). Several models are de- 
fined, and a prior is specified reflecting the initial 
belief on the relative plausibility of each model. In 
the absence of better knowledge, we would assign 
each model equal prior weight. If MCMC is used, 
then reversible jump MCMC (Green, 1995) can be 
used to sample what may be very high-dimensional 
model space using the approach of King and Brooks 
(2002a, b). 

3. MODELING POPULATION PROCESSES 
AND ASSOCIATED INFERENCE 

3.1 Leslie and Lefkovitch Matrices Decomposed 

A Leslie matrix (Leslie, 1945, 1948; Caswell, 2001) 
is a population projection matrix that shows how, 
in the absence of process variation, a population 
updates itself from one year to the next, through 
births, deaths and aging. We use the term "gener- 
alized Leslie matrix" to indicate an age-structured 
population projection matrix with at least one addi- 
tional process, such as movement or sex assignment. 
If, instead of aging, we have stages of development, 
then the population projection matrix is usually re- 
ferred to as a Lefkovitch matrix (Lefkovitch, 1965; 
Caswell, 2001). In the following, we show how such 
matrices can be expressed as products of simpler 
matrices, each one corresponding to a single biolog- 
ical process (Buckland et al., 2004). This approach 
has been used in the context of deterministic matrix 
population models by Lebreton (1973) and Lebreton 
and Isenmann (1976). Hooten, Wikle, Dorazio and 
Royle (2007) used the same strategy to split their 
matrix model for the growth and spread of the US 
population of Eurasian collared-doves into a growth 
process and a movement process. 

Suppose that we have one species of animal, di- 
vided into two age classes, with a total of uq^-i 
newly born individuals and n\ :t -i adults in the pop- 
ulation at the end of year t — 1. Suppose further that 
the population is subject to just three processes: sur- 
vival through the next year, age incrementation and 
births. For simplicity, we model the females in the 
population only. The expected number of survivors 
to the end of the next year can then be written 

/ -E(s ,t) \ = c( no,t-i \ (<f>o \ / n ,t-i \ 



where (j>o is the probability of survival of newly born 
individuals to the end of their first year, <p\ is the an- 
nual survival probability of adults and S is a survival 
projection matrix. We can introduce demographic 
stochasticity by specifying 

( s 0>t ~ binomial (n ,i_i , 4>o) \ 
\sij ~ binomial (ni i4 „ i, 0i )y ' 

We could also introduce environmental stochastic- 
ity by allowing the survival probabilities </>o and 4>\ 
to vary at random between years. Survival proba- 
bilities could also be modeled as functions of co- 
variates, for example, using logistic functions, as in 
Buckland et al. (2004). If some of these covariates 
relate to the individual, such as body mass or con- 
dition, then i-states (the states of individuals in the 
population) can be modeled. If relevant covariates 
are unavailable, but survival probabilities are vari- 
able, they could be modeled using random effects. 

Age incrementation is deterministic, so that the 
numbers of individuals immediately before births 
take place are 




where A is an aging projection matrix. Finally, we 
can specify the birth process by writing 

[E(n 0>t )\ f W1AW \ 

v ni >t J WtJ \o iy Uw' 

where B is a birth projection matrix, with a suit- 
able stochastic model for births, for example, reo,t ~ 
Poisson(Aai^). (Note that the first element of the 
birth matrix could equivalently be set to zero, since 
after aging and before new births occur, there are 
no young in the population; we set it to unity in 
anticipation of the growth model below.) 

Thus, conditional on numbers at the end of year 
t — 1, the expected numbers of individuals present 
at the end of year t are 

f E(n ,t)\ = (1X\ [00\ (<f>o \ (n ,t-i\ 
\E(n 1:t )J \01J\11J \ fa) \n lt t-ij 

_ ( A^o \4>\\ ( n ,t-i\ 
Wo 4>i ) \ni jt -i) ' 

We denote this Model 1. The vector n.( = P -*) is 
termed the p-state or simply the state vector, be- 
cause it tallies the number of individuals in each 
state (mature or immature here). The product of 
matrices is an example of a simple Leslie matrix. 
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Note, however, that only survivors can breed; in the 
traditional Leslie matrix, the first row would not 
contain the survival parameters, so that the birth 
rate applies to all individuals, whether or not they 
survive until the breeding season. By varying the se- 
quencing of the processes that alter the population, 
one can readily construct variations on classic Leslie 
matrix formulations. Note that the above equation 
may be expressed in matrix form as E(p.t\nt—i) = 
Pnt_i, with the Leslie population projection matrix 
P = BAS = ( A ( f £). 

Suppose we wish to model two growth stages rather 
than two age classes, and the probability that an in- 
dividual moves from stage 1 to stage 2 is ir. Then 
the age incrementation matrix A= j) is replaced 
by the growth matrix G = ( 1 ~ 7r 1) and the product 
of the three matrices is now a Lefkovitch population 
projection matrix: P = BGS = (^"^g^ • De- 
note this by Model 2. 

We can now extend Model 1 to have two subpop- 
ulations with movement rate /i between them; we 
label this Model 3. If movement occurs just before 
the breeding season, then 
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where noj t t indicates number of immature individu- 
als in subpopulation j in year t, j = 1,2, and sim- 
ilarly for adults. This may be expressed in matrix 
form as E(p.t\nt—i) = BAMSri£_i, where the birth 
(B), aging (A), movement (M) and survival (S) pro- 
jection matrices are as above. The product of ma- 
trices is now a generalized Leslie matrix: 

P = BAMS 

'A</> (1 - n) A0i(l - n) \(j)QH A0i/i 
0o(l-/i) 0i(l-/x) fay, <j>ip 
A(/>oM MiH A0 O (1 ~ M) A0i(1 - n) 
(j>o[J, fan (j>o{l-n) <j>\{l-n) 

Diagrammatic representations of Models 1-3 are given 
in the Appendix. 



Caswell (2001, page 110) considers an additive 
decomposition of the projection matrix, in which 
one of the matrices describes reproduction and the 
other transitions. The matrix for reproduction com- 
prises elements that represent the expected num- 
ber of offspring to individuals, where columns rep- 
resent the type of individual, and rows the type of 
offspring. The transition matrix comprises elements 
that represent the probability that an individual 
of type j (represented by columns) at time t — 1 
is in the population and of type i (represented by 
rows) at time t. For the above multiplicative de- 
compositions, each process can be specified by its 
own matrix, whereas in Caswell's additive decom- 
position, each process must be included as a com- 
ponent of either reproduction or transition. Thus for 
a model with growth stages and two sexes, survival 
and growth processes both appear in the transition 
matrix, and birth and sex assignment are both han- 
dled in the reproduction matrix. The multiplicative 
decomposition therefore offers an easier formulation 
of the model. It also explicitly orders processes in 
time, whereas the additive decomposition does not. 
The latter approach therefore shares a disadvantage 
with the standard Leslie matrix: the breeding rate 
applies to all individuals, whether or not they sur- 
vive to the breeding season. (If the year is redefined 
to start just before the breeding season, then first- 
year survival cannot be modeled using an additive 
decomposition.) A further disadvantage of the addi- 
tive decomposition is that it does not allow the num- 
ber of states within the population to vary through 
the year, whereas this is readily modeled in the se- 
quential multiplicative framework (Buckland et al., 
2004), allowing greater flexibility. 

3.2 The State Equation 

Denote the state vector after process k has oc- 
curred in year t by Ufc t , where k = 1,...,K, say, 
and t = 1, . . . , T. Then u K ,t = n t . For Model 1, for 
example, K = 3; ui^ represents numbers of survivors 
in each i-state through to the end of year t, t^t rep- 
resents numbers of adults after age incrementation, 
and = rij. The state equation for process k in 
year t specifies how the states are updated. Assum- 
ing the process is first-order Markov (an assumption 
that may easily be relaxed), Ufc jt = Pfc,t(u.fc-i,t) for 
k = 1, . . . ,K, where u ,t = n t _i = u K ,t-i, and Pfc,t(-) 
is an operator to be specified, which is random if the 
process is stochastic. For the general case, the state 
equation may be expressed as 

(1) n t = P t (n t _i) J 
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where Pf(-) is the composition 

Pt(-)=P*,t(P*-i,t(---Pi,t(- 



•))• 



Matrix population models in which all processes 
are deterministic arise as special cases. Then we can 
write 

n t = P t n t _i, 

where P t = PK,t^K-i,t • ■ ■ Pi,t is typically a gener- 
alized Leslie or Lefkovitch matrix. Note that P^ may 
be a function of ri£_i, allowing models that are non- 
linear in the states, so that, for example, rates can 
be a function of population size (Caswell, 2001). A 
second special case is when at least one process is 
stochastic, but the expected values of the elements 
of may be expressed as functions of the elements 
of n^_i. Then 



(2) 



E(n t |n t _i) = P t n f _i, 



where Pt is a population projection matrix such that 
•E(P t (n t _i)[n t _i) =P t n t -i l Examples are given by 
Models 1-3. For Model 1, Pi t(") is a random vari- 
able generated from a binomial distribution corre- 
sponding to each i-state, P2,t( - ) is deterministic, and 
P3 1(") has a stochastic Poisson-distributed compo- 
nent and a deterministic one. In this case, 
£?tP 3 ,t(P2,t(Pi,t(nt-l)))|nt-l] = BASn t _i so that 
Pi = BAS. We stress that the model fitting pro- 
cedures described here do not require (2) to hold 
and are applicable in the wider context of (1). 

For a full specification of the process model, we 
need to choose a probability density function (p.d.f.) 
for the initial state vector, say go(pa\0), and a p.d.f. 
for the state vector at time t given the state vector 
at time t — l,gt(nt\nt-i,9), where 9 is the vector 
of unknown parameters. This p.d.f. is determined 
by the K p.d.f. 's corresponding to each of the pro- 
cesses in (1) (Buckland et al., 2004). We note that, 
in the case of multiple processes, the resulting p.d.f. 
is a composite function that can also include convo- 
lutions. Direct evaluation of the p.d.f. can thus be 
quite difficult without the inclusion of intermediate 
latent states (Clark et al., 2005; Newman, Fernandez, 
Buckland and Thomas, 2007); Lele (2006) addresses 
a similar problem within a likelihood framework. 

3.3 The Observation Equation 

The observation equation relates the observation 
vector yt to the states at time t through an opera- 
tor O t (-), so that y t = O t (n t ). We denote the cor- 
responding observation p.d.f. by ft(yt\nt,9). If the 



operator is random but linear, then E(yt\nt) = O^n^ 
for the appropriate matrix O^. For example, if for 
Model 3 we have an estimate yj t of total population 
size just after breeding for subpopulation j, with 
corresponding variance cr| t , j = 1,2, and the two 
estimates are independently normally distributed, 
then 



E(yt\n t ) 



(E(yi, t ) 



1100 
00 11 



n>ii,t 
\ni2,tJ 



and 



ft(yt\n t ,e) = l[ 



2 ™h 



cxp 



{Vj,t ~ E(y jtt )} 



2 



In a Bayesian context, a prior distribution for a 
may be based on an estimate <x? t , together with its 
estimated precision. 

Unknown parameters may feature in the projec- 
tion matrix Oj. In the above example, if the obser- 
vations were incomplete counts, with probability p 
that any individual animal was counted, we might 
have 



E(y t \n t ) 



(E(yi,t) 
\E(V2,t) 



ppO 
Opp 



fn i,t\ 
nn,t 
no2.t 

\ni2,tJ 



and 



ft(y t \n t ,e 

2 



n 



noj,t + nij,t 
Vj,t 



pVi,t(\ - p^ n 0j,t+ni;j,t-yj,t _ 



Given additional independent data for estimating p, 
the observation p.d.f. may be expanded by multiply- 
ing by the likelihood corresponding to those data. 

3.4 Bayesian Inference 

Within a Bayesian framework, denote the prior 
distribution of the parameters 9 by g(9). Then the 
joint prior distribution for the state vector n t and 
the parameters 9 is 

T 

g(9) x g (n \9) x JJ 5t (n t |n t _i,0), 

t=\ 

and the likelihood is 

T 

Y[ft(yt\n t ,9). 

t=i 
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Hence the posterior distribution is 
Sr(n o ,...,n T ,0|yi,...,y T ) 



(3) 



g(0) x 5o(n o |0) 



n{ 5t (n t |ni_i,0) X ft(yt\n t ,e)} 

t=i j 



x(/(y 1 ,...,y T ))- 1 . 

This approach for modeling complex environmental 
and ecological processes was espoused by Berliner 
(1996), Wikle, Berliner and Cressie (1998) and Wikle 
(2003). 

At time t <T, the following types of inference are 
often useful: filtering, using g(n t ,0\yi, . . . ,y t ) (i.e., 
only observations up to time t are used); smooth- 
ing, using g(n t , 6\yi, ■ ■ ■ , yr) (i-e., the full time se- 
ries of observations up to time T is used to estimate 
the state vector at time t); and prediction, using 
g(n t /,0|yi, . . . ,y t ) (i.e., observations up to year t are 
used to predict the state vector in year t' , t' > t). 

3.5 Model Fitting 

SIS and MCMC are both computer-intensive 
Monte Carlo methods, useful for fitting complex mod- 
els of the type described above, usually within a 
Bayesian framework. Liu (2001) describes both SIS 
and MCMC, and Gilks, Richardson and Spiegelhal- 
ter (1996) and Doucet, de Freitas and Gordon (2001) 
include a range of articles on MCMC and SIS, re- 
spectively. 

Newman et al. (2006, 2007) give full details of how 
the two approaches may be applied to fit models 
that fall within the above framework. Because SIS 
is less widely used than MCMC, we start with a brief 
description of it. 

3.5.1 Sequential importance sampling. In its sim- 
plest form, a large number R of "particles" are sim- 
ulated from the prior distribution g(no,0) = g(0) x 
<7o( n o|$)- The rth "particle" represents a single prior 
realization r of the parameters of the population 
model, together with a single realization of the pop- 
ulation in the initial year, no, r - Each particle is pro- 
jected forward to time t = 1, by simulating the state 
vector ni jr from gi(ni|no jr , r ) for all r. Particles 
are resampled with weights proportional to the con- 
tribution to the likelihood of any observations at 
t = 1. Thus the weights are w r = /i(yi|ni jr ,0 r )/ 



J2?=i /i(yi \ n i,r, r ). This resampling, known as boot- 
strap filtering, was introduced by Gordon, Salmond 
and Smith (1993). The surviving particles (0 r ,no, r , 
ni jT .) are approximately a sample from the posterior 
distribution, given the data at time t = l. They may 
be projected forward to t = 2, with the state vector 
simulated from <?2(n2|ni ir , no, r , Or), and so on, un- 
til resampling at time T has been carried out. The 
surviving particles are then an approximate sample 
from the posterior distribution given by (3). 

A practical problem with the above implementa- 
tion of SIS is "particle depletion": typically, only 
a tiny proportion of original particles survives, and 
most of those appear in the posterior sample many 
times. Consequently, Monte Carlo variation in the 
sample posterior distribution between different sim- 
ulations can be quite high for this simplest form 
of sequential importance sampling. The problem is 
more acute for the static parameters and the states 
corresponding to the early years. Various tricks are 
used to mitigate the effects of particle depletion, 
such as sampling from a proposal distribution that 
more closely matches the posterior than does the 
prior (Liu and Chen, 1998), kernel smoothing (West, 
1993a, 1993b; Trenkel, Elston and Buckland, 2000), 
the auxiliary particle filter (Pitt and Shephard, 1999; 
Thomas et al., 2005), residual sampling and partial 
rejection control (Liu, 2001). 

3.5.2 Markov chain Monte Carlo. This method 
generates dependent samples from the posterior dis- 
tribution in (3), by simulating a Markov chain whose 
stationary distribution is the required posterior dis- 
tribution. A particular strength of this method is 
that it allows the typically large dimensionality of 
the posterior distribution to be broken down, by 
simulating blocks of variables (parameters or states) 
in turn, conditioning on the current values of the 
other variables in the chain. An efficient implemen- 
tation of MCMC for fitting population dynamics 
models depends critically on the choice of blocking 
scheme for the parameters and states, and of pro- 
posal distributions (Gilks and Roberts, 1996). Repa- 
rameterization can be very useful in some cases. 

3.6 Extending the Model Framework 

The modeling framework captured by (1) is very 
flexible, and goes far beyond the simple examples 
presented in Section 3.1. We now briefly discuss some 
of the possibilities. 
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For moderate and large populations, demographic 
stochasticity may be negligible compared to envi- 
ronmental stochasticity. Vital rates (i.e., birth and 
survival rates) can be modeled as (autocorrelated) 
random variables to account for environmental 
stochasticity or trends in habitat quality (Rivot et al., 
2004). Vital rates could also vary in a completely 
random manner (i.i.d. sequences; Newman, 2000), 
evolve according to a discrete-time Markov chain, 
or follow an autoregressive-moving-average process 
(Caswell, 2001, pages 378-379; see also Johnson and 
Hoeting, 2003). Further, hierarchical models can be 
used that include random effects for sampling cor- 
related processes (Clark et al., 2005). More detailed 
models could specify vital rates as functions of co- 
variates, such as the environment, habitat or age, 
which could vary stochastically. These covariates 
could also reflect deliberate management actions, 
or accidental anthropogenic changes in the environ- 
ment, allowing their effects on population abundance 
to be predicted. Inclusion of covariates that relate 
to i-state allows modeling of individuals. Modeling 
vital rates as random effects is another way to allow 
for individual variation. Density-dependent effects 
(i.e., effects that reduce birth and/or survival rates 
as the population gets larger) can be introduced by 
making vital rates a function of population size. Har- 
vesting and assignment of sex and genotype to new- 
born animals can also be incorporated. 

At a more ambitious level, environmental and de- 
mographic processes can be modeled together, rather 



than simply using environmental variables as covari- 
ates in models for population-level parameters. For 
example, Wikle, Berliner and Cressie (1998) devel- 
oped a hierarchical space-time model for monthly 
temperatures and Wikle (2003) developed a hierar- 
chical space-time model for the spread of the house 
finch. In principle, assuming that temperatures af- 
fect house finch survival and subsequent spread, a 
combined hierarchical space-time model for monthly 
temperatures and house finch abundance is possible. 

Spatial structure can also be modeled, by extend- 
ing the simple movement model represented by 
Model 3; the movement rate may also be modeled 
as a function of animal density and/or covariates 
(Buckland et al., 2004; Thomas et al, 2005). Simi- 
larly, we can include additional species in the model 
by extending the state vector to include entries cor- 
responding to all species. Survival probabilities and/or 
birth rates of one species can then be modeled as 
functions of the abundance of the other species, as 
in discrete-time predator-prey, consumer-resource or 
community models (Gurney and Nisbet, 1998). 

Many of these possibilities can be viewed as ex- 
tensions to standard matrix models, or to stochas- 
tic linear models as in (2). Table 1 contains some 
examples, all of which are "hidden" process mod- 
els because we do not directly observe the states 
that arise from each process (e.g., numbers of deaths 
and survivors as a result of the survival process) . In- 
stead, we observe or estimate certain states at cer- 
tain points in time (e.g., number of females present 



Table 1 

Representations of hidden process models as matrix population models 



Randomly varying vital rates: 

Vital rates as functions of covariates: 

Density dependence: 

Metapopulations: 




Pll Pl2 
P 2 1 P22 



where the off-diagonal submatrices handle movement between subpopulations. 



Multiple species: 



E(n t |n t _i)=£ 



n a ,t 



n a ,t-i 



J \ 



I n a , t _i 



/Pii(nt-i) 

P 22 (n t _i) 

V 



/ n a ,t-i 

V ; 



"The matrix Pt is the population projection matrix, showing how expectations of states n t conditional on n t _i relate to n.t_i. 
The matrix P t might have elements that are expectations of functions of the vital rates (indicated by P*)i or might depend 
on covariates xt [denoted P(x t )] or states n t _i [denoted P(n t _i)]. More complex models can be obtained by combining the 
different model types. Further, Pt might depend on intermediate states, corresponding to a time point between t — 1 and t, 
in which case the expectations do not hold and the models should be interpreted in the more general context of (1). 



INFERENCE FROM POPULATION MODELS 



9 



at breeding colonies). Inferences about all the under- 
lying hidden processes, such as plausible parametric 
forms for models of the processes, and estimates of 
the corresponding parameters, can be drawn from 
these observations (Harwood and Stokes, 2003; 
Thomas et al., 2005; Newman et al., 2006). 

As Clark and Bj0rnstad (2004) note, missing ob- 
servations, for example as the result of uneven sam- 
pling intervals, are readily accommodated by the 
state-space framework. Because we split the annual 
population processes into chronological order, our 
framework is readily extended to allow observations 
at different times of the year. Changing effort, an- 
other issue considered by Clark and Bj0rnstad (2004), 
is also easily accommodated in this framework: cap- 
ture or detection probability can be modeled in the 
observation equation as a function of effort, or counts 
can be adjusted for effort, so that the adjusted counts 
(which may be absolute or relative abundance esti- 
mates) are entered into the observation equation, 
along with their estimated precision. Dupuis (1995) 
and Clark et al. (2005) modeled structured popu- 
lations in the context of mark-recapture, and Clark 
et al. (2005) noted that other data models could 
be used with their methods. In our framework, as 
with Clark et al. (2005), the structure appears in 
the state vector, and transitions are modeled in the 
state equation. For mark-recapture, if the mark sta- 
tus of an animal is included in the state vector, then 
the capture model is incorporated into the state 
equation (Buckland et al., 2004), and the obser- 
vation equation is degenerate: the numbers of ani- 
mals with capture histories ending with capture are 
known without error. If, on the other hand, the cap- 
ture process is considered to be part of the obser- 
vation process (in which case inference is not condi- 
tional on known numbers of individuals captured), 
it is modeled in the observation equation. Popula- 
tion processes such as survival, births and transi- 
tions (e.g., movement) are handled in the state equa- 
tion, allowing a population dynamics model to be 
"embedded" in the mark-recapture analysis. 

3.7 Limitations 

The models described here operate in discrete time 
and space, and the population is partitioned into 
discrete states. Movement models that are contin- 
uous in space can be incorporated, by specifying a 
probability density function for distance and direc- 
tion moved, given an animal's current state, location 



and other relevant covariates. Extension to contin- 
uous time is less straightforward. However, we can 
readily decrease the time intervals that are mod- 
eled to reduce the approximations implicit in dis- 
cretization. Further, for time intervals in which only 
a single process (e.g., survival) operates, no approx- 
imation is required — in effect, we integrate out time 
over the period. For example, if survival and move- 
ment processes operate synchronously, we can spec- 
ify models for the instantaneous death and move- 
ment rates, and use simulation-based methods to 
embed these continuous-time models in the above 
structure. However, the benefits of this approach 
(in terms of improved correspondence with reality) 
would have to be weighed against the additional 
computational burden. 

The same approach can be used to model commu- 
nities of species, but our ability to fit such models 
is hampered by a lack of knowledge and of data on 
species interactions. For example, one species may 
act as both predator and prey of another at differ- 
ent stages of their life cycles. Similarly, knowledge is 
often lacking about the form of the relationships be- 
tween biological processes and environmental covari- 
ates. However, technological advances in data log- 
ging offer greater potential for gathering data that 
can be used to infer such relationships. As computer 
power increases, we can, for example, contemplate 
the development of community models that incorpo- 
rate the important processes, while accommodating 
the major sources of uncertainty. 

4. EXAMPLE: MODELING A 
METAPOPULATION 

Thomas et al. (2005) developed a hidden process 
model of the metapopulation dynamics of British 
grey seals. These animals spend over 80% of their 
time at sea (McConnell, Fedak, Lovell and Hammond, 
1999), and 90% of this time underwater (Thompson, 
Hammond, Nicholas and Fedak, 1991), so that it is 
difficult to survey the entire population. However, 
the species breeds colonially and pups spend most 
of the first three weeks of their lives ashore, where 
they are readily counted. Aerial pup counts have 
been conducted at all of the major Scottish breed- 
ing colonies in every year since 1984. These counts 
are used to estimate the total number of pups born 
at each colony in each year. 

To illustrate the ideas of this paper, we modeled 
the population of female seals for the period 1984- 
2002 by aggregating the colonies into four geograph- 
ically distinct regions: North Sea (4 colonies) Inner 
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Fig. 1. Model for British grey seals. For clarity, only one region is shown. Dashed lines indicate stochastic processes, solid 
lines deterministic processes. 



Hebrides (19 colonies), Outer Hebrides (11 colonies) 
and the Orkneys (22 colonies). We used this model 
to investigate whether it was more likely that the dif- 
ferential trends in pup production observed among 
the four regions were due to density-dependent sur- 
vival and movement, or to deliberate killing of seals 
to protect salmon farms. 

The model is shown diagrammatically in Figure 1 . 
We can come close to expressing it in the form of a 
matrix population model by writing ^(n^n^i) — 
BM^AStrit-i, where and nt_i are column vec- 
tors listing number of individuals in each of seven 
age categories and four geographical regions. The 
matrix B represents reproduction, with the same 
birth rate for all mature females (age 6 or more); Mt 
represents movement of seals aged 5 (those recruit- 
ing to the breeding population) between regions, 
which is modeled as a function of relative densi- 
ties and distances between sites; A represents the 
annual deterministic process of aging; and St is a 
diagonal matrix of survival probabilities, which are 
modeled ELS cl function of either pup production or 
salmon farming activity. Note that the movement 
matrix is density dependent, and the survival 



matrix St is either density dependent or a function 
of salmon farming activity, so that both vary by 
year, whereas the aging matrix A is certainly time 
independent, and the birth matrix B is assumed to 
be. The product Pj = BMj ASj is the generalized 
Leslie matrix, a population projection matrix. How- 
ever, the nonlinearity arising from the density de- 
pendence in Mt invalidates this matrix representa- 
tion of the model, which needs to be interpreted in 
the more general framework of (1). The density de- 
pendence in St would be allowable in the matrix 
representation framework, because it is modeled us- 
ing D.t—1} whereas the density dependence in M^ is 
modeled using the intermediate state vector U2,t. 

Pup production estimates were assumed to be nor- 
mally distributed, and sequential importance sam- 
pling was used to fit the models. Two measures of 
salmon farming activity were considered: annual 
salmon production by area, and total staff numbers 
by area. In the matrix notation, St now becomes 
S(xt), where is either salmon production or total 
staff numbers, or S(nt_i) for the model in which sur- 
vival is a function of pup production. Annual salmon 
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Fig. 2. Estimates of pup production for model without density dependence and movement, but with annual salmon production 
as a covariate for modeling survival. Input data are shown by open circles. Mean estimates of pup numbers from the model 
are shown, together with the 2.5 and 97.5 percentiles of our estimates. 



production, with a common parameter across re- 
gions for its possible effect on seal survival, provided 
the best fitting model, as judged by Akaike's Infor- 
mation Criterion. 

Smoothed estimates of numbers of pups in each of 
the regions are shown in Figure 2 for the model with- 
out density dependence and movement, but with 
annual salmon production as a covariate. This sim- 
plistic model is able to fit the time series of pup 
production estimates reasonably well. However, the 
estimated levels of deliberate killing are implausi- 
bly high, with a best estimate of around 4000 seals 
shot annually in recent years in the Outer Hebrides. 
This level of mortality is unlikely to occur unde- 
tected, suggesting that the differences in trends in 
pup production among regions are not due solely to 
anthropogenic killing. 

5. DISCUSSION 

Ecologists often have a good understanding of the 
processes regulating the abundance of particular an- 
imal or plant populations, but the statistical meth- 
ods that have traditionally been used to model time 



series of abundance estimates do not exploit this 
knowledge. Consequently, the resulting models are 
ineffective for predicting future changes. They also 
provide no mechanism for exploring the impact of 
different management actions, such as harvest strate- 
gies or reintroductions of a species into its former 
range. Explicit models of population processes are 
needed for this kind of prediction. However, the un- 
certainty associated with these predictions will not 
be adequately quantified if these models are not 
fully embedded into inference. Hence, the risks as- 
sociated with different management strategies will 
be unknown. Methods of the type described here 
are therefore essential for effective management of 
species and of the ecosystems to which they belong. 

In the section on limitations, the issues of con- 
tinuous-time processes and modeling of communities 
were raised. These topics offer many research oppor- 
tunities. Guidelines or rules of thumb for model for- 
mulation with associated measures of the extent of 
nonidentifiability or weak identifiability are needed. 
For example, given the observation and state vec- 
tors, which components of the state vector and which 
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parameters are identifiable? Quoting Harvey (1989, 
page 205): "The question of identifiability is a fun- 
damental one in statistical modeling. It is particu- 
larly important in the context of unobserved compo- 
nents modeling since it is very easy to set up models 
which are not identifiable." Methods exist for lin- 
ear regression models (e.g., variance inflation factors 
and condition numbers of the correlation matrix for 
predictors; Myers, 1990), for mark-recapture models 
(Catchpole and Morgan, 1997; Catchpole, Morgan 
and Freeman, 1998) and for time-invariant struc- 
tural models (Harvey, 1989), but similar methods 
have not yet been developed for nonlinear, nonsta- 
tionary state-space models. 

Methods for fitting nonlinear, non-Gaussian state- 
space models are the subject of much research (e.g., 
Doucet, de Freitas and Gordon, 2001). Choices in- 
clude MCMC, SIS and other sequential Monte Carlo 
procedures. Whereas these are general-purpose com- 
putational methodologies, the details of any specific 
algorithm are bound to be quite complicated. The 
dimension of the posterior distribution is typically 
very high, as it includes the static parameters and 
the state vectors for all years. At the same time, 
there is often not enough data information to iden- 
tify all unknowns clearly, leading to strong corre- 
lations in the posterior distribution. The combina- 
tion of high dimensionality with strong correlations 
is certainly challenging from a computational per- 
spective and there are research opportunities here. 
In addition, practitioners would benefit from guid- 
ance on when to choose one procedure over another, 
and from the development of software for largely 
automated fitting of state-space models. WinBUGS 
(MRC Biostatistics Unit, Cambridge, UK), which 
uses MCMC, has been successfully used for some 
complicated state-space models for ecological data 
(Rivot et al., 2004), but if correlations between pa- 
rameters and states are high, convergence can be 
extremely slow (Newman et al., 2007). 

Finally, state-space model formulation and infer- 
ence need to be fully integrated with what Caswell 
calls a "complete demographic analysis" that in- 
cludes asymptotic dynamics, transient dynamics and 
perturbation analysis. 

APPENDIX: DIAGRAMMATIC 
REPRESENTATIONS OF THE MODELS 

Diagrams in Figures 3-5 above are similar to life 
cycle graphs (Caswell, 2001, pages 56-59). We use 



dashed arrows to indicate stochastic processes, and 
solid arrows to indicate deterministic processes. Also 
shown are the rates associated with the stochastic 
processes: 4>q and fa are the survival rates of young 
and adults, respectively, and A is the mean number 
of births per adult. For Model 2, tt is the annual 
rate of change from stage 1 to stage 2. These rates 
may themselves be modeled, to allow dependence on 
environmental variables, population size, numbers 
of predators, resources, and so on. We can represent 
Model 1 as Figure 3. Model 2 may be represented 
by Figure 4. And finally Model 3. Rates have been 
omitted from Figure 5 to aid clarity. 
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